AITopics | recall performance

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Neural Information Processing SystemsFeb-16-2026, 12:07:56 GMT

ac112e8ffc4e5b9ece32070440a8ca43-Paper-Conference.pdf

information retrieval, machine learning, natural language, (20 more...)

Country:

Asia > China (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Neural Information Processing SystemsDec-24-2025, 22:07:29 GMT

A Neural Corpus Indexer for Document Retrieval

Current state-of-the-art document retrieval solutions mainly follow an index-retrieve paradigm, where the index is hard to be directly optimized for the final retrieval target. In this paper, we aim to show that an end-to-end deep neural network unifying training and indexing stages can significantly improve the recall performance of traditional methods. To this end, we propose Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a designated query. To optimize the recall performance of NCI, we invent a prefix-aware weight-adaptive decoder architecture, and leverage tailored techniques including query generation, semantic document identifiers, and consistency-based regularization. Empirical studies demonstrated the superiority of NCI on two commonly used academic benchmarks, achieving +21.4% and +16.8% relative enhancement for Recall@1 on NQ320k dataset and R-Precision on TriviaQA dataset, respectively, compared to the best baseline method.

document retrieval, name change, neural corpus indexer, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Neural Information Processing SystemsNov-21-2025, 15:17:09 GMT

A simple model of recognition and recall memory

artificial intelligence, recognition and recall memory, simple model, (7 more...)

Technology: Information Technology > Artificial Intelligence (0.40)

arXiv.org Artificial IntelligenceNov-3-2025

Understanding and Enhancing Mamba-Transformer Hybrids for Memory Recall and Language Modeling

Lee, Hyunji, Yu, Wenhao, Zhang, Hongming, Ma, Kaixin, Kim, Jiyeon, Yu, Dong, Seo, Minjoon

Hybrid models that combine state space models (SSMs) with attention mechanisms have shown strong performance by leveraging the efficiency of SSMs and the high recall ability of attention. However, the architectural design choices behind these hybrid models remain insufficiently understood. In this work, we analyze hybrid architectures through the lens of memory utilization and overall performance, and propose a complementary method to further enhance their effectiveness. We first examine the distinction between sequential and parallel integration of SSM and attention layers. Our analysis reveals several interesting findings, including that sequential hybrids perform better on shorter contexts, whereas parallel hybrids are more effective for longer contexts. We also introduce a data-centric approach of continually training on datasets augmented with paraphrases, which further enhances recall while preserving other capabilities. It generalizes well across different base models and outperforms architectural modifications aimed at enhancing recall. Our findings provide a deeper understanding of hybrid SSM-attention models and offer practical guidance for designing architectures tailored to various use cases. Our findings provide a deeper understanding of hybrid SSM-attention models and offer practical guidance for designing architectures tailored to various use cases.

large language model, machine learning, natural language, (18 more...)

2510.26912

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Construction & Engineering (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.84)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Neural Information Processing SystemsOct-9-2025, 04:26:23 GMT

Model-enhanced Vector Index

The document retrieval stage retrieves candidate documents related to the query, while the ranking stage re-ranks the documents using precise ranking scores.

information retrieval, machine learning, natural language, (20 more...)

Country:

Asia > China (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceAug-26-2025

Kernel Ridge Regression for Efficient Learning of High-Capacity Hopfield Networks

Tamamori, Akira

Hopfield networks using Hebbian learning suffer from limited storage capacity. While supervised methods like Linear Logistic Regression (LLR) offer some improvement, kernel methods like Kernel Logistic Regression (KLR) significantly enhance storage capacity and noise robustness. However, KLR requires computationally expensive iterative learning. We propose Kernel Ridge Regression (KRR) as an efficient kernel-based alternative for learning high-capacity Hopfield networks. KRR utilizes the kernel trick and predicts bipolar states via regression, crucially offering a non-iterative, closed-form solution for learning dual variables. We evaluate KRR and compare its performance against Hebbian, LLR, and KLR. Our results demonstrate that KRR achieves state-of-the-art storage capacity (reaching a storage load of 1.5) and noise robustness, comparable to KLR. Crucially, KRR drastically reduces training time, being orders of magnitude faster than LLR and significantly faster than KLR, especially at higher storage loads. This establishes KRR as a potent and highly efficient method for building high-performance associative memories, providing comparable performance to KLR with substantial training speed advantages. This work provides the first empirical comparison between KRR and KLR in the context of Hopfield network learning.

artificial intelligence, krr, machine learning, (16 more...)

2504.12561

Country:

Asia > Japan (0.40)
North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.61)

arXiv.org Artificial IntelligenceJul-10-2025

A Systematic Analysis of Hybrid Linear Attention

Wang, Dustin, Zhu, Rui-Jie, Abreu, Steven, Shan, Yong, Kergan, Taylor, Pan, Yuqi, Chou, Yuhong, Li, Zheng, Zhang, Ge, Huang, Wenhao, Eshraghian, Jason

Transformers face quadratic complexity and memory issues with long sequences, prompting the adoption of linear attention mechanisms using fixed-size hidden states. However, linear models often suffer from limited recall performance, leading to hybrid architectures that combine linear and full attention layers. Despite extensive hybrid architecture research, the choice of linear attention component has not been deeply explored. We systematically evaluate various linear attention models across generations - vector recurrences to advanced gating mechanisms - both standalone and hybridized. To enable this comprehensive analysis, we trained and open-sourced 72 models: 36 at 340M parameters (20B tokens) and 36 at 1.3B parameters (100B tokens), covering six linear attention variants across five hybridization ratios. Benchmarking on standard language modeling and recall tasks reveals that superior standalone linear models do not necessarily excel in hybrids. While language modeling remains stable across linear-to-full attention ratios, recall significantly improves with increased full attention layers, particularly below a 3:1 ratio. Our study highlights selective gating, hierarchical recurrence, and controlled forgetting as critical for effective hybrid models. We recommend architectures such as HGRN-2 or GatedDeltaNet with a linear-to-full ratio between 3:1 and 6:1 to achieve Transformer-level recall efficiently. Our models are open-sourced at https://huggingface.co/collections/m-a-p/hybrid-linear-attention-research-686c488a63d609d2f20e2b1e.

large language model, machine learning, natural language, (19 more...)

2507.06457

Country: North America > Mexico > Gulf of Mexico (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Neural Information Processing SystemsJan-18-2025, 08:28:54 GMT

A Neural Corpus Indexer for Document Retrieval

Current state-of-the-art document retrieval solutions mainly follow an index-retrieve paradigm, where the index is hard to be directly optimized for the final retrieval target. In this paper, we aim to show that an end-to-end deep neural network unifying training and indexing stages can significantly improve the recall performance of traditional methods. To this end, we propose Neural Corpus Indexer (NCI), a sequence-to-sequence network that generates relevant document identifiers directly for a designated query. To optimize the recall performance of NCI, we invent a prefix-aware weight-adaptive decoder architecture, and leverage tailored techniques including query generation, semantic document identifiers, and consistency-based regularization. Empirical studies demonstrated the superiority of NCI on two commonly used academic benchmarks, achieving 21.4% and 16.8% relative enhancement for Recall@1 on NQ320k dataset and R-Precision on TriviaQA dataset, respectively, compared to the best baseline method.

document retrieval, neural corpus indexer, recall performance, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)

arXiv.org Artificial IntelligenceNov-28-2024

DENIAHL: In-Context Features Influence LLM Needle-In-A-Haystack Abilities

Dai, Hui, Pechi, Dan, Yang, Xinyi, Banga, Garvit, Mantri, Raghav

The Needle-in-a-haystack (NIAH) test is a general task used to assess language models' (LMs') abilities to recall particular information from long input context. This framework however does not provide a means of analyzing what factors, beyond context length, contribute to LMs' abilities or inabilities to separate and recall needles from their haystacks. To provide a systematic means of assessing what features contribute to LMs' NIAH capabilities, we developed a synthetic benchmark called DENIAHL (Data-oriented Evaluation of NIAH for LLM's). Our work expands on previous NIAH studies by ablating NIAH features beyond typical context length including data type, size, and patterns. We find stark differences between GPT-3.5 and LLaMA 2-7B's performance on DENIAHL, and drops in recall performance when features like item size are increased, and to some degree when data type is changed from numbers to letters. This has implications for increasingly large context models, demonstrating factors beyond item-number impact NIAH capabilities.

benchmark, large language model, machine learning, (20 more...)

2411.1936

Country:

North America > United States > New York (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)